Project - Plant Seedlings Classification


Background & Context:

Objective:

Data Description:

This dataset contains images of unique plants belonging to 12 different species. The data files are images.npy and Labels.csv.

Because of the large volume of data, the images were converted to NumPy arrays and stored in images.npy, and the corresponding labels were placed in Labels.csv, so that you can work on the project without having to handle the raw image files.


List of Plant Species

The dataset comprises 12 plant species.


Importing required Libraries/Packages


Loading the Dataset
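
A minimal sketch of loading the two files named in the data description. The function name load_dataset is hypothetical, and the code assumes Labels.csv has one header row with the species name in its first column; adjust if the real file is laid out differently.

```python
import csv

import numpy as np


def load_dataset(images_path="images.npy", labels_path="Labels.csv"):
    """Load the image array and the matching label column.

    Assumption: Labels.csv has a header row and the label is in column 0.
    """
    images = np.load(images_path)
    with open(labels_path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the (assumed) header row
        labels = np.array([row[0] for row in reader if row])
    if len(images) != len(labels):
        raise ValueError(
            f"{len(images)} images but {len(labels)} labels; files do not match"
        )
    return images, labels
```

Calling load_dataset() from the directory that holds both files returns the image array and a 1-D array of label strings of the same length.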



Overview of the dataset


Distribution of each Class
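
As a rough illustration, the per-class counts can be computed with NumPy before plotting them. The helper name class_distribution is an assumption, not something taken from the notebook.

```python
import numpy as np


def class_distribution(labels):
    """Return {label: count}, ordered from most to least frequent."""
    classes, counts = np.unique(np.asarray(labels), return_counts=True)
    order = np.argsort(-counts, kind="stable")
    return {str(classes[i]): int(counts[i]) for i in order}
```

A strongly uneven result here is what motivates the class weights used later in preprocessing.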

Observations:

Plotting a few random images and their corresponding labels

Treating images based on BGR vs RGB
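
Images read with OpenCV are stored in BGR channel order, while most plotting libraries expect RGB. A minimal NumPy sketch of the conversion, assuming the array has shape (N, H, W, 3); cv2.cvtColor with cv2.COLOR_BGR2RGB is the usual per-image OpenCV equivalent.

```python
import numpy as np


def bgr_to_rgb(images):
    """Reverse the last (channel) axis: BGR -> RGB. Returns a view, not a copy."""
    return images[..., ::-1]
```

Applying the function twice returns the original channel order, which is a quick way to check which order the stored arrays actually use.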


Observations:

Exploratory Data Analysis


Observations:

Inferences:

Summary of Dataset & EDA

Data Preprocessing


Visualizing images using Gaussian Blur
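
To show what the blur does, here is a NumPy-only sketch of a separable Gaussian filter for a single H x W x C image. It is a stand-in for a library call such as OpenCV's cv2.GaussianBlur; the kernel size and sigma are illustrative assumptions, not the values used in the notebook.

```python
import numpy as np


def gaussian_kernel_1d(size=5, sigma=1.0):
    """Normalized 1-D Gaussian kernel (size should be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    kernel = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def gaussian_blur(image, size=5, sigma=1.0):
    """Blur an H x W x C uint8 image using two 1-D passes and reflect padding."""
    k = gaussian_kernel_1d(size, sigma)
    pad = size // 2
    img = image.astype(np.float64)
    h, w = img.shape[:2]
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    # Vertical pass: weighted sum of row-shifted copies.
    tmp = sum(k[i] * padded[i:i + h, :, :] for i in range(size))
    # Horizontal pass: weighted sum of column-shifted copies.
    out = sum(k[j] * tmp[:, j:j + w, :] for j in range(size))
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)
```

Blurring smooths pixel-level noise in the background at the cost of fine leaf detail, so it is worth comparing blurred and original images side by side.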

Observations:

Resizing images
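
Resizing reduces the number of input pixels and therefore the training cost. The sketch below uses nearest-neighbor sampling in plain NumPy on a batch of shape (N, H, W, C); it is a simplified substitute for an interpolating call such as cv2.resize, and the target size is whatever the model input requires (not stated here).

```python
import numpy as np


def resize_nearest(images, new_h, new_w):
    """Nearest-neighbor resize of an (N, H, W, C) batch to (N, new_h, new_w, C)."""
    _, h, w, _ = images.shape
    rows = np.arange(new_h) * h // new_h  # source row for each output row
    cols = np.arange(new_w) * w // new_w  # source column for each output column
    return images[:, rows][:, :, cols]
```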

Data Processing for modeling


Splitting the dataset
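
Because the classes are not equally represented, a stratified split keeps the class proportions similar across the train, validation, and test sets. The following is a NumPy sketch under assumed split fractions of 10% each for validation and test; scikit-learn's train_test_split with the stratify argument is the more common route.

```python
import numpy as np


def stratified_split(labels, val_frac=0.1, test_frac=0.1, seed=42):
    """Return (train_idx, val_idx, test_idx), splitting each class separately."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    train, val, test = [], [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_test = int(round(len(idx) * test_frac))
        n_val = int(round(len(idx) * val_frac))
        test.extend(idx[:n_test])
        val.extend(idx[n_test:n_test + n_val])
        train.extend(idx[n_test + n_val:])
    return np.array(train), np.array(val), np.array(test)
```

The returned index arrays can be used to slice both the image array and the label array, so the two stay aligned.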

Making the data compatible:

Encoding the target labels
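
A softmax output layer with categorical cross-entropy expects one-hot targets. This NumPy sketch shows the idea; the helper name one_hot_encode is hypothetical, and scikit-learn's LabelBinarizer is a common alternative.

```python
import numpy as np


def one_hot_encode(labels):
    """Map string labels to one-hot rows; also return the sorted class names."""
    classes, indices = np.unique(np.asarray(labels).ravel(), return_inverse=True)
    one_hot = np.eye(len(classes), dtype=np.float32)[indices]
    return classes, one_hot
```

Keeping the returned classes array makes it possible to turn a predicted index back into a species name with classes[index].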

Data Normalization
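
For 8-bit images, normalization usually means scaling pixel values from [0, 255] to [0, 1]. A one-line sketch, assuming the arrays are stored as uint8:

```python
import numpy as np


def normalize_images(images):
    """Scale uint8 pixel values to float32 in the range [0, 1]."""
    return images.astype(np.float32) / 255.0
```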

Checking the shape of the data

Initializing Class weights to balance the Classes
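
One common weighting scheme is the "balanced" heuristic, where each class gets weight n_samples / (n_classes * class_count), so rarer classes count more in the loss. The sketch below implements that formula in NumPy, assuming integer class indices 0..K-1 that all occur at least once; whether the notebook used exactly this scheme is an assumption.

```python
import numpy as np


def balanced_class_weights(class_indices):
    """Return {class_index: weight} with weight = n_samples / (n_classes * count)."""
    class_indices = np.asarray(class_indices)
    counts = np.bincount(class_indices)
    n_samples, n_classes = len(class_indices), len(counts)
    return {i: float(n_samples / (n_classes * c)) for i, c in enumerate(counts)}
```

A dictionary of this shape is what Keras accepts through the class_weight argument of Model.fit.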

Data Preprocessing Summary:


We will now build and compare several models to determine which one best identifies the plant species.

Model Building


Model 0 - Simple Convolutional Neural Network (CNN)

Observations:

Fitting the model on the train data

Model Evaluation

Observations:

Model Evaluation on Training dataset

Observations

Model Evaluation on Validation dataset

Observations

Model Evaluation on Test dataset
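
A minimal sketch of how test-set accuracy and a confusion matrix can be derived from a model's predicted probabilities and the one-hot targets. The function names are hypothetical, and probs stands for the output of something like model.predict on the test images.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def evaluate(probs, y_true_one_hot):
    """Return (accuracy, confusion matrix) from class probabilities."""
    y_pred = np.asarray(probs).argmax(axis=1)
    y_true = np.asarray(y_true_one_hot).argmax(axis=1)
    cm = confusion_matrix(y_true, y_pred, np.asarray(probs).shape[1])
    accuracy = np.trace(cm) / cm.sum()
    return float(accuracy), cm
```

Off-diagonal entries show which species are confused with each other, which a single accuracy figure hides.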

Observations:

Visualizing the Predictions:

Inference:


Model 1 - Convolutional Neural Network (CNN) with additional Layers

Observations:

Fitting the model on the train data

Model Evaluation

Observations:

Model Evaluation on Training dataset

Observations

Model Evaluation on Validation dataset

Observations

Model Evaluation on Test dataset

Observations:

Generating the predictions using test data

Inference:

Model 2 - CNN with Data Augmentation


Building a sequential CNN model with data augmentation to check whether model performance and metrics improve further
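
Data augmentation creates modified copies of the training images so the model sees more variation. The transformations used here are not listed, so the sketch below shows only one assumed example, a random horizontal flip, in plain NumPy on a batch of shape (N, H, W, C). In a Keras workflow this is typically handled by an augmentation generator or preprocessing layers instead.

```python
import numpy as np


def random_horizontal_flip(images, prob=0.5, seed=None):
    """Flip each image left-right with probability `prob`.

    Returns the augmented batch and the boolean mask of flipped images.
    """
    rng = np.random.default_rng(seed)
    flip = rng.random(len(images)) < prob
    out = images.copy()
    out[flip] = images[flip][:, :, ::-1, :]  # reverse the width axis
    return out, flip
```

Augmentation should be applied to the training set only, so that validation and test metrics reflect unmodified images.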

Observations:

Fitting the model on the train data

Model Evaluation

Observations:

Model Evaluation on Training dataset

Observations

Model Evaluation on Validation dataset

Observations

Model Evaluation on Test dataset

Observations

Visualizing the Predictions:

Inference:

Model 3 - CNN with Transfer Learning using VGG16


Using the VGG16 model for Transfer Learning

Observations:

Observations:

Fitting the model on the train data

Model Evaluation

Observations:

Model Evaluation on Training dataset

Observations

Model Evaluation on Validation dataset

Observations

Model Evaluation on Test dataset

Observations

Visualizing the Predictions:

Inference:

Conclusion

Inferences:

Model 2 (the CNN with data augmentation) has the better accuracy and variance, and therefore the better chance of predicting results accurately. We can use this model to classify the plant species.